<!DOCTYPE html>
<html class="writer-html5" lang="en" >
<head>
  <meta charset="utf-8" /><meta name="generator" content="Docutils 0.17.1: http://docutils.sourceforge.net/" />

  <meta name="viewport" content="width=device-width, initial-scale=1.0" />
  <title>Inverter (Conjugate Gradient) &mdash; SIMULATeQCD 0.7 documentation</title>
      <link rel="stylesheet" href="../_static/pygments.css" type="text/css" />
      <link rel="stylesheet" href="../_static/css/theme.css" type="text/css" />
      <link rel="stylesheet" href="../_static/togglebutton.css" type="text/css" />
      <link rel="stylesheet" href="../_static/custom.css" type="text/css" />
  <!--[if lt IE 9]>
    <script src="../_static/js/html5shiv.min.js"></script>
  <![endif]-->
  
        <script data-url_root="../" id="documentation_options" src="../_static/documentation_options.js"></script>
        <script src="../_static/jquery.js"></script>
        <script src="../_static/underscore.js"></script>
        <script src="../_static/doctools.js"></script>
        <script src="../_static/togglebutton.js"></script>
        <script>var togglebuttonSelector = '.toggle, .admonition.dropdown';</script>
        <script async="async" src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js"></script>
        <script>window.MathJax = {"options": {"processHtmlClass": "tex2jax_process|mathjax_process|math|output_area"}}</script>
    <script src="../_static/js/theme.js"></script>
    <link rel="index" title="Index" href="../genindex.html" />
    <link rel="search" title="Search" href="../search.html" />
    <link rel="next" title="Correlator Class" href="correlator.html" />
    <link rel="prev" title="Modules" href="modules.html" /> 
</head>

<body class="wy-body-for-nav"> 
  <div class="wy-grid-for-nav">
    <nav data-toggle="wy-nav-shift" class="wy-nav-side">
      <div class="wy-side-scroll">
        <div class="wy-side-nav-search"  style="background: #343131" >
            <a href="../index.html" class="icon icon-home"> SIMULATeQCD
          </a>
<div role="search">
  <form id="rtd-search-form" class="wy-form" action="../search.html" method="get">
    <input type="text" name="q" placeholder="Search docs" />
    <input type="hidden" name="check_keywords" value="yes" />
    <input type="hidden" name="area" value="default" />
  </form>
</div>
        </div><div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="Navigation menu">
              <ul class="current">
<li class="toctree-l1"><a class="reference internal" href="../01_gettingStarted/gettingStarted.html">Getting started</a></li>
<li class="toctree-l1"><a class="reference internal" href="../02_contributions/contributions.html">Contributions</a></li>
<li class="toctree-l1"><a class="reference internal" href="../03_applications/applications.html">Applications</a></li>
<li class="toctree-l1"><a class="reference internal" href="../04_codeBase/codeBase.html">Code base</a></li>
<li class="toctree-l1 current"><a class="reference internal" href="modules.html">Modules</a><ul class="current">
<li class="toctree-l2 current"><a class="current reference internal" href="#">Inverter (Conjugate Gradient)</a></li>
<li class="toctree-l2"><a class="reference internal" href="correlator.html">Correlator Class</a></li>
<li class="toctree-l2"><a class="reference internal" href="gaugeUpdates.html">Gauge Updates (HB and OR)</a></li>
<li class="toctree-l2"><a class="reference internal" href="dslash.html">Dslash</a></li>
<li class="toctree-l2"><a class="reference internal" href="integrator.html">Integrator</a></li>
<li class="toctree-l2"><a class="reference internal" href="randomNumbers.html">Random Number Generator</a></li>
<li class="toctree-l2"><a class="reference internal" href="gaugeSmearing.html">HISQ smearing</a></li>
</ul>
</li>
</ul>

        </div>
      </div>
    </nav>

    <section data-toggle="wy-nav-shift" class="wy-nav-content-wrap"><nav class="wy-nav-top" aria-label="Mobile navigation menu"  style="background: #343131" >
          <i data-toggle="wy-nav-top" class="fa fa-bars"></i>
          <a href="../index.html">SIMULATeQCD</a>
      </nav>

      <div class="wy-nav-content">
        <div class="rst-content">
          <div role="navigation" aria-label="Page navigation">
  <ul class="wy-breadcrumbs">
      <li><a href="../index.html" class="icon icon-home"></a> &raquo;</li>
          <li><a href="modules.html">Modules</a> &raquo;</li>
      <li>Inverter (Conjugate Gradient)</li>
      <li class="wy-breadcrumbs-aside">
            <a href="../_sources/05_modules/inverter.md.txt" rel="nofollow"> View page source</a>
      </li>
  </ul>
  <hr/>
</div>
          <div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article">
           <div itemprop="articleBody">
             
  <section class="tex2jax_ignore mathjax_ignore" id="inverter-conjugate-gradient">
<h1>Inverter (Conjugate Gradient)<a class="headerlink" href="#inverter-conjugate-gradient" title="Permalink to this headline"></a></h1>
<p>The Conjugate Gradient (CG) inverter solves the equation <span class="math notranslate nohighlight">\(Ax=b\)</span>, where <span class="math notranslate nohighlight">\(A\)</span> is a symmetric and positive definite matrix,
<span class="math notranslate nohighlight">\(b\)</span> an input vector and <span class="math notranslate nohighlight">\(x\)</span> the solution vector. Different versions of the CG inverter are implemented in our code
in the two classes <code class="docutils literal notranslate"><span class="pre">ConjugateGradient</span></code> and <code class="docutils literal notranslate"><span class="pre">AdvancedMultiShiftCG</span></code>. The matrix <span class="math notranslate nohighlight">\(A\)</span> in our case is, up to now, always <span class="math notranslate nohighlight">\(M^{\dagger}M\)</span>,
with <span class="math notranslate nohighlight">\(M\)</span> being the fermion matrix.</p>
<section id="the-basic-algorithm">
<h2>The Basic Algorithm<a class="headerlink" href="#the-basic-algorithm" title="Permalink to this headline"></a></h2>
<p>Given an initial guess <span class="math notranslate nohighlight">\(x_0\)</span> and a target residual <span class="math notranslate nohighlight">\(\epsilon\)</span>, compute <span class="math notranslate nohighlight">\(r_0=b-Ax\)</span> and set <span class="math notranslate nohighlight">\(p_0=r_0\)</span>.</p>
<blockquote>
<div><p><strong>while</strong> <span class="math notranslate nohighlight">\(|r_{i+1}|^2 &lt; \epsilon\)</span> <br />
<span class="math notranslate nohighlight">\(\qquad \alpha_{i} = \frac{|r_i|^2}{p^{\dagger}Ap}\)</span><br />
<span class="math notranslate nohighlight">\(\qquad x_{i+1} = x_{i} + \alpha_{i}p_{i}\)</span><br />
<span class="math notranslate nohighlight">\(\qquad r_{i+1} = r_{i} - \alpha_{i}r_{i}\)</span><br />
<span class="math notranslate nohighlight">\(\qquad \beta_{i} = \frac{|r_{i+1}|^2}{|r_i|^2}\)</span><br />
<span class="math notranslate nohighlight">\(\qquad p_{i+1} = r_{i+1} + \beta_{i}p_{i}\)</span><br />
<span class="math notranslate nohighlight">\(\qquad i = i+1\)</span><br />
<strong>end while</strong><br />
<strong>return</strong> <span class="math notranslate nohighlight">\(x_{i+1}\)</span></p>
</div></blockquote>
</section>
<section id="residual-drift-and-precision">
<h2>Residual drift and precision<a class="headerlink" href="#residual-drift-and-precision" title="Permalink to this headline"></a></h2>
<p>In exact arithmetic, the iteratively computed residual <span class="math notranslate nohighlight">\(r_i\)</span> is always equivalent to the true residual <span class="math notranslate nohighlight">\(r=b-Ax_{i}\)</span>. Due to rounding errors in floating point calculations, this is not true in practical use and the iterated residual <span class="math notranslate nohighlight">\(r_i\)</span> will drift away from the true residual <span class="math notranslate nohighlight">\(r=b-Ax_{i}\)</span>. The magnitude of this drift depends on the floating point precision that you are using and the number of iterations the CG needs until convergence. This problem can be resolved by replacing the iterated residual with the exact residual occasionally.</p>
</section>
<section id="multi-rhs-and-multi-shift">
<h2>Multi-RHS and Multi-shift<a class="headerlink" href="#multi-rhs-and-multi-shift" title="Permalink to this headline"></a></h2>
<p>Two important improvements of the basic CG algorithm that we employ in our code are the use of multiple right-hand sides and multiple shifts. These two optimizations cannot be used at the same time.
The multi rhs version of the CG solves <span class="math notranslate nohighlight">\(Ax_{k}=b_{k}\)</span> for multiple “right-hand sides” <span class="math notranslate nohighlight">\(b_k\)</span> simultaneously. This can improve performance significantly since the most expensive part in a CG iteration is the matrix-vector product <span class="math notranslate nohighlight">\(Ap\)</span> that enters <span class="math notranslate nohighlight">\(\alpha\)</span>. The performance of this matrix-vector product, the Dslash kernel, is limited by the GPU’s memory bandwidth. By applying the same matrix <span class="math notranslate nohighlight">\(A\)</span> to multiple vectors at once, <span class="math notranslate nohighlight">\(A\)</span> only needs to be loaded once, thereby saving memory bandwidth and increasing performance.
The multi-shift CG solves <span class="math notranslate nohighlight">\(\left(A+\sigma_{k}\mathbb{1}\right)x_{k}=b\)</span> with multiple shifts <span class="math notranslate nohighlight">\(\sigma_{k}\)</span>. Without going into too much technical details, the multi-shift CG essentially solves the system <span class="math notranslate nohighlight">\(\left(A+\sigma_{0}\mathbb{1}\right)x_0=b\)</span> with the smallest shift <span class="math notranslate nohighlight">\(\sigma_0\)</span> with the basic CG algorithm and computes the solutions for the larger shifts through some smart linear algebra on the way, thereby solving multiple problems at the cost of only one.</p>
</section>
<section id="the-conjugategradient-class">
<h2>The ConjugateGradient Class<a class="headerlink" href="#the-conjugategradient-class" title="Permalink to this headline"></a></h2>
<p>This class implements the CG algorithm shown above and supports using multiple right hand sides.
It has two template parameters, <code class="docutils literal notranslate"><span class="pre">floatT</span></code> and <code class="docutils literal notranslate"><span class="pre">NStacks</span></code> that define the floating point type to be used and the number of right hand sides, respectively.</p>
<div class="highlight-C++ notranslate"><div class="highlight"><pre><span></span><span class="n">ConjugateGradient</span><span class="o">&lt;</span><span class="n">floatT</span><span class="p">,</span><span class="w"> </span><span class="n">NStacks</span><span class="o">&gt;</span><span class="w"> </span><span class="n">cg</span><span class="p">;</span><span class="w"></span>
</pre></div>
</div>
<p>The member functions of ConjugateGradient are different flavors of the basic algorithm listed above. Specifically, we have</p>
<div class="highlight-C++ notranslate"><div class="highlight"><pre><span></span><span class="k">template</span><span class="w"> </span><span class="o">&lt;</span><span class="k">typename</span> <span class="nc">Spinor_t</span><span class="o">&gt;</span><span class="w"></span>
<span class="kt">void</span><span class="w"> </span><span class="n">invert</span><span class="p">(</span><span class="n">LinearOperator</span><span class="o">&lt;</span><span class="n">Spinor_t</span><span class="o">&gt;&amp;</span><span class="w"> </span><span class="n">dslash</span><span class="p">,</span><span class="w"> </span><span class="n">Spinor_t</span><span class="o">&amp;</span><span class="w"> </span><span class="n">spinorOut</span><span class="p">,</span><span class="w"> </span><span class="n">Spinor_t</span><span class="o">&amp;</span><span class="w"> </span><span class="n">spinorIn</span><span class="p">,</span><span class="w"> </span><span class="kt">int</span><span class="w"> </span><span class="n">max_iter</span><span class="p">,</span><span class="w"> </span><span class="kt">double</span><span class="w"> </span><span class="n">precision</span><span class="p">);</span><span class="w"></span>
</pre></div>
</div>
<p>which implements a multi-rhs CG with an absolute stopping criterion, i.e. the algorithm stops when <span class="math notranslate nohighlight">\(|r|^2&lt;\mathrm{precision}\)</span>. The first argument <code class="docutils literal notranslate"><span class="pre">LinearOperator&lt;Spinor_t&gt;&amp;</span> <span class="pre">dslash</span></code> can be any class that has the function <code class="docutils literal notranslate"><span class="pre">void</span> <span class="pre">applyMdaggM(Spinor_t&amp;,</span> <span class="pre">Spinor_t&amp;,</span> <span class="pre">bool</span> <span class="pre">update)</span></code> defined. In our case, this will usually be some dslash class. The second and third arguments are output and input <code class="docutils literal notranslate"><span class="pre">Spinorfields</span></code>. <code class="docutils literal notranslate"><span class="pre">max_iter</span></code> is the maximum number of iterations after which the CG exits and <code class="docutils literal notranslate"><span class="pre">precision</span></code> is the target residual used in the stopping criterion.</p>
<div class="highlight-C++ notranslate"><div class="highlight"><pre><span></span><span class="k">template</span><span class="w"> </span><span class="o">&lt;</span><span class="k">typename</span> <span class="nc">Spinor_t</span><span class="o">&gt;</span><span class="w"></span>
<span class="kt">void</span><span class="w"> </span><span class="n">invert_new</span><span class="p">(</span><span class="n">LinearOperator</span><span class="o">&lt;</span><span class="n">Spinor_t</span><span class="o">&gt;&amp;</span><span class="w"> </span><span class="n">dslash</span><span class="p">,</span><span class="w"> </span><span class="n">Spinor_t</span><span class="o">&amp;</span><span class="w"> </span><span class="n">spinorOut</span><span class="p">,</span><span class="w"> </span><span class="k">const</span><span class="w"> </span><span class="n">Spinor_t</span><span class="o">&amp;</span><span class="w"> </span><span class="n">spinorIn</span><span class="p">,</span><span class="w"> </span><span class="k">const</span><span class="w"> </span><span class="kt">int</span><span class="w"> </span><span class="n">max_iter</span><span class="p">,</span><span class="w"> </span><span class="k">const</span><span class="w"> </span><span class="kt">double</span><span class="w"> </span><span class="n">precision</span><span class="p">);</span><span class="w"></span>
</pre></div>
</div>
<p>invert_new does the same as invert but uses a relative stopping criterion, i.e. the algorithm stops when <span class="math notranslate nohighlight">\(\frac{|r|^2}{|r_{0}|^2}&lt;\mathrm{precision}\)</span>, where <span class="math notranslate nohighlight">\(r_0\)</span> is the starting residual.</p>
<div class="highlight-C++ notranslate"><div class="highlight"><pre><span></span><span class="k">template</span><span class="w"> </span><span class="o">&lt;</span><span class="k">typename</span> <span class="nc">Spinor_t</span><span class="o">&gt;</span><span class="w"></span>
<span class="kt">void</span><span class="w"> </span><span class="n">invert_res_replace</span><span class="p">(</span><span class="n">LinearOperator</span><span class="o">&lt;</span><span class="n">Spinor_t</span><span class="o">&gt;&amp;</span><span class="w"> </span><span class="n">dslash</span><span class="p">,</span><span class="w"> </span><span class="n">Spinor_t</span><span class="o">&amp;</span><span class="w"> </span><span class="n">spinorOut</span><span class="p">,</span><span class="w"> </span><span class="k">const</span><span class="w"> </span><span class="n">Spinor_t</span><span class="o">&amp;</span><span class="w"> </span><span class="n">spinorIn</span><span class="p">,</span><span class="w"> </span><span class="k">const</span><span class="w"> </span><span class="kt">int</span><span class="w"> </span><span class="n">max_iter</span><span class="p">,</span><span class="w"> </span><span class="k">const</span><span class="w"> </span><span class="kt">double</span><span class="w"> </span><span class="n">precision</span><span class="p">);</span><span class="w"></span>
</pre></div>
</div>
<p>invert_res_replace employs a residual replacement strategy where the iterated residual is replaced by the exact residual occasionally.</p>
</section>
<section id="the-advancedmultishiftcg-class">
<h2>The AdvancedMultiShiftCG Class<a class="headerlink" href="#the-advancedmultishiftcg-class" title="Permalink to this headline"></a></h2>
<p>This class implements the multi-shift CG algorithm. The template parameters are the same as for the ConjugateGradient class.</p>
<p>The member function</p>
<div class="highlight-C++ notranslate"><div class="highlight"><pre><span></span><span class="k">template</span><span class="w"> </span><span class="o">&lt;</span><span class="k">typename</span> <span class="nc">SpinorIn_t</span><span class="p">,</span><span class="w"> </span><span class="k">typename</span> <span class="nc">SpinorOut_t</span><span class="o">&gt;</span><span class="w"></span>
<span class="kt">void</span><span class="w"> </span><span class="n">invert</span><span class="p">(</span><span class="n">LinearOperator</span><span class="o">&lt;</span><span class="n">SpinorIn_t</span><span class="o">&gt;&amp;</span><span class="w"> </span><span class="n">dslash</span><span class="p">,</span><span class="w"> </span><span class="n">SpinorOut_t</span><span class="o">&amp;</span><span class="w"> </span><span class="n">spinorOut</span><span class="p">,</span><span class="w"> </span><span class="k">const</span><span class="w"> </span><span class="n">SpinorIn_t</span><span class="o">&amp;</span><span class="w"> </span><span class="n">spinorIn</span><span class="p">,</span><span class="w"> </span><span class="n">SimpleArray</span><span class="o">&lt;</span><span class="n">floatT</span><span class="p">,</span><span class="w"> </span><span class="n">NStacks</span><span class="o">&gt;</span><span class="w"> </span><span class="n">sigma</span><span class="p">,</span><span class="w"> </span><span class="k">const</span><span class="w"> </span><span class="kt">int</span><span class="w"> </span><span class="n">max_iter</span><span class="p">,</span><span class="w"> </span><span class="k">const</span><span class="w"> </span><span class="kt">double</span><span class="w"> </span><span class="n">precision</span><span class="p">);</span><span class="w"></span>
</pre></div>
</div>
<p>performs the multi-shift inversion. In contrast to ConjugateGradient::invert, this function has two template parameters, one for the input spinor type and one for the output spinor type. The input spinor is a <code class="docutils literal notranslate"><span class="pre">Spinorfield</span></code> with <code class="docutils literal notranslate"><span class="pre">NStacks=1</span></code> and the output spinor is a <code class="docutils literal notranslate"><span class="pre">Spinorfield</span></code> where <code class="docutils literal notranslate"><span class="pre">NStacks</span></code> matches the number of shifts. The shifts are passed to the function via <code class="docutils literal notranslate"><span class="pre">SimpleArray&lt;floatT,</span> <span class="pre">NStacks&gt;</span> <span class="pre">sigma</span></code>.</p>
</section>
<section id="mixed-precision">
<h2>Mixed precision<a class="headerlink" href="#mixed-precision" title="Permalink to this headline"></a></h2>
<p>The time to solution of the CG can be significantly decreased by using mixed precision approaches. Since it is an iterative method, we can use half precision floating point arithmetic for the bulk part of the computation and inject full precision residuals occasionally to correct for rounding errors accumulated along the way.  This is implemented in the member function <code class="docutils literal notranslate"><span class="pre">invert_mixed</span></code> of <code class="docutils literal notranslate"><span class="pre">ConjugateGradient</span></code>:</p>
<div class="highlight-C++ notranslate"><div class="highlight"><pre><span></span><span class="k">template</span><span class="w"> </span><span class="o">&lt;</span><span class="k">typename</span> <span class="nc">Spinor_t</span><span class="p">,</span><span class="w"> </span><span class="k">typename</span> <span class="nc">Spinor_t_inner</span><span class="o">&gt;</span><span class="w"></span>
<span class="kt">void</span><span class="w"> </span><span class="n">invert_mixed</span><span class="p">(</span><span class="n">LinearOperator</span><span class="o">&lt;</span><span class="n">Spinor_t</span><span class="o">&gt;&amp;</span><span class="w"> </span><span class="n">dslash</span><span class="p">,</span><span class="w"> </span><span class="n">LinearOperator</span><span class="o">&lt;</span><span class="n">Spinor_t_inner</span><span class="o">&gt;&amp;</span><span class="w"> </span><span class="n">dslash_inner</span><span class="p">,</span><span class="w"> </span><span class="n">Spinor_t</span><span class="o">&amp;</span><span class="w"> </span><span class="n">spinorOut</span><span class="p">,</span><span class="w"> </span><span class="k">const</span><span class="w"> </span><span class="n">Spinor_t</span><span class="o">&amp;</span><span class="w"> </span><span class="n">spinorIn</span><span class="p">,</span><span class="w"> </span><span class="k">const</span><span class="w"> </span><span class="kt">int</span><span class="w"> </span><span class="n">max_iter</span><span class="p">,</span><span class="w"> </span><span class="k">const</span><span class="w"> </span><span class="kt">double</span><span class="w"> </span><span class="n">precision</span><span class="p">,</span><span class="w"> </span><span class="kt">double</span><span class="w"> </span><span class="n">delta</span><span class="p">);</span><span class="w"></span>
</pre></div>
</div>
<p><code class="docutils literal notranslate"><span class="pre">void</span> <span class="pre">invert_mixed</span></code> recomputes a full precision true residual if the norm of the current residual decreased by a factor <span class="math notranslate nohighlight">\(\frac{1}{\delta}\)</span> compared to the residual from the last restart. It then resets the CG by re-projecting the gradient vector <span class="math notranslate nohighlight">\(p_{i}\)</span> such that it is orthogonal to the new, true residual.</p>
<p>In contrast to the other invert methods, one has to pass an additional, lower precision dslash operator to these methods. An example how to use the mixed precision inverters is given below</p>
<div class="highlight-C++ notranslate"><div class="highlight"><pre><span></span><span class="w"> </span><span class="c1">//declare target precision gauge fields and half precision gauge fields</span>
<span class="w"> </span><span class="n">Gaugefield</span><span class="o">&lt;</span><span class="n">floatT</span><span class="p">,</span><span class="w"> </span><span class="n">onDevice</span><span class="p">,</span><span class="w"> </span><span class="n">HaloDepth</span><span class="p">,</span><span class="w"> </span><span class="n">R18</span><span class="o">&gt;</span><span class="w"> </span><span class="n">gauge_smeared</span><span class="p">(</span><span class="n">commBase</span><span class="p">);</span><span class="w"></span>
<span class="w"> </span><span class="n">Gaugefield</span><span class="o">&lt;</span><span class="n">floatT</span><span class="p">,</span><span class="w"> </span><span class="n">onDevice</span><span class="p">,</span><span class="w"> </span><span class="n">HaloDepth</span><span class="p">,</span><span class="w"> </span><span class="n">U3R14</span><span class="o">&gt;</span><span class="w"> </span><span class="n">gauge_naik</span><span class="p">(</span><span class="n">commBase</span><span class="p">);</span><span class="w"></span>
<span class="w"> </span><span class="n">Gaugefield</span><span class="o">&lt;</span><span class="n">floatT_inner</span><span class="p">,</span><span class="w"> </span><span class="n">onDevice</span><span class="p">,</span><span class="w"> </span><span class="n">HaloDepth</span><span class="p">,</span><span class="w"> </span><span class="n">R18</span><span class="o">&gt;</span><span class="w"> </span><span class="n">gauge_smeared_lowprec</span><span class="p">(</span><span class="n">commBase</span><span class="p">);</span><span class="w"></span>
<span class="w"> </span><span class="n">Gaugefield</span><span class="o">&lt;</span><span class="n">floatT_inner</span><span class="p">,</span><span class="w"> </span><span class="n">onDevice</span><span class="p">,</span><span class="w"> </span><span class="n">HaloDepth</span><span class="p">,</span><span class="w"> </span><span class="n">U3R14</span><span class="o">&gt;</span><span class="w"> </span><span class="n">gauge_naik_lowprec</span><span class="p">(</span><span class="n">commBase</span><span class="p">);</span><span class="w"></span>

<span class="c1">//declare spinors</span>
<span class="n">Spinorfield</span><span class="o">&lt;</span><span class="n">floatT</span><span class="p">,</span><span class="w"> </span><span class="n">onDevice</span><span class="p">,</span><span class="w"> </span><span class="n">LatLayout</span><span class="p">,</span><span class="w"> </span><span class="n">HaloDepthSpin</span><span class="p">,</span><span class="w"> </span><span class="n">NStacks</span><span class="o">&gt;</span><span class="w"> </span><span class="n">spinorIn</span><span class="p">(</span><span class="n">commBase</span><span class="p">);</span><span class="w"></span>
<span class="n">Spinorfield</span><span class="o">&lt;</span><span class="n">floatT</span><span class="p">,</span><span class="w"> </span><span class="n">onDevice</span><span class="p">,</span><span class="w"> </span><span class="n">LatLayout</span><span class="p">,</span><span class="w"> </span><span class="n">HaloDepthSpin</span><span class="p">,</span><span class="w"> </span><span class="n">NStacks</span><span class="o">&gt;</span><span class="w"> </span><span class="n">spinorOut</span><span class="p">(</span><span class="n">commBase</span><span class="p">);</span><span class="w"></span>


<span class="c1">//declare CG, regular dslash and half precision dslash</span>
<span class="n">ConjugateGradient</span><span class="o">&lt;</span><span class="n">floatT</span><span class="p">,</span><span class="w"> </span><span class="n">NStacks</span><span class="o">&gt;</span><span class="w"> </span><span class="n">cg</span><span class="p">;</span><span class="w"></span>

<span class="n">HisqDSlash</span><span class="o">&lt;</span><span class="n">floatT</span><span class="p">,</span><span class="w"> </span><span class="n">onDevice</span><span class="p">,</span><span class="w"> </span><span class="n">LatLayout</span><span class="p">,</span><span class="w"> </span><span class="n">HaloDepth</span><span class="p">,</span><span class="w"> </span><span class="n">HaloDepthSpin</span><span class="p">,</span><span class="w"> </span><span class="n">NStacks</span><span class="o">&gt;</span><span class="w"> </span><span class="n">dslash</span><span class="p">(</span><span class="n">gauge_smeared</span><span class="p">,</span><span class="w"> </span><span class="n">gauge_naik</span><span class="p">,</span><span class="w"> </span><span class="n">param</span><span class="p">.</span><span class="n">m_ud</span><span class="p">());</span><span class="w"></span>
<span class="n">HisqDSlash</span><span class="o">&lt;</span><span class="n">floatT_inner</span><span class="p">,</span><span class="w"> </span><span class="n">onDevice</span><span class="p">,</span><span class="w"> </span><span class="n">LatLayout</span><span class="p">,</span><span class="w"> </span><span class="n">HaloDepth</span><span class="p">,</span><span class="w"> </span><span class="n">HaloDepthSpin</span><span class="p">,</span><span class="w"> </span><span class="n">NStacks</span><span class="o">&gt;</span><span class="w"> </span><span class="n">dslash_lowprec</span><span class="p">(</span><span class="n">gauge_smeared_lowprec</span><span class="p">,</span><span class="w"> </span><span class="n">gauge_naik_lowprec</span><span class="p">,</span><span class="w"> </span><span class="n">param</span><span class="p">.</span><span class="n">m_ud</span><span class="p">());</span><span class="w"></span>



<span class="c1">//do stuff with the regular gaugefield and spinors, like smearing and some physics</span>


<span class="c1">//copy results into half prec gauge fields. THIS STEP IS IMPORTANT!</span>
<span class="w"> </span><span class="n">gauge_smeared_lowprec</span><span class="p">.</span><span class="n">convert_precision</span><span class="p">(</span><span class="n">gauge_smeared</span><span class="p">);</span><span class="w"></span>
<span class="w"> </span><span class="n">gauge_naik_lowprec</span><span class="p">.</span><span class="n">convert_precision</span><span class="p">(</span><span class="n">gauge_naik</span><span class="p">);</span><span class="w"></span>

<span class="c1">//invert using floatT-floatT_inner CG solver</span>
<span class="n">cg</span><span class="p">.</span><span class="n">invert_mixed</span><span class="p">(</span><span class="n">dslash</span><span class="p">,</span><span class="w"> </span><span class="n">dslash_lowprec</span><span class="p">,</span><span class="w"> </span><span class="n">spinorOut</span><span class="p">,</span><span class="w"> </span><span class="n">spinorIn</span><span class="p">,</span><span class="w"> </span><span class="n">param</span><span class="p">.</span><span class="n">cgMax</span><span class="p">(),</span><span class="w"> </span><span class="n">param</span><span class="p">.</span><span class="n">residue</span><span class="p">(),</span><span class="w"> </span><span class="n">param</span><span class="p">.</span><span class="n">cgMixedPrec_delta</span><span class="p">());</span><span class="w"></span>
</pre></div>
</div>
</section>
</section>


           </div>
          </div>
          <footer>

  <hr/>

  <div role="contentinfo">
    <p>&#169; Copyright 2021, LatticeQCD.</p>
  </div>

  Built with <a href="https://www.sphinx-doc.org/">Sphinx</a> using a
    <a href="https://github.com/readthedocs/sphinx_rtd_theme">theme</a>
    provided by <a href="https://readthedocs.org">Read the Docs</a>.
   

</footer>
        </div>
      </div>
    </section>
  </div>
  <script>
      jQuery(function () {
          SphinxRtdTheme.Navigation.enable(true);
      });
  </script> 

</body>
</html>